-
Notifications
You must be signed in to change notification settings - Fork 790
[UR][Benchmarks] Support for Unitrace for all benchmarks #19320
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: sycl
Are you sure you want to change the base?
Conversation
Signed-off-by: Mateusz P. Nowak <mateusz.p.nowak@intel.com>
Signed-off-by: Mateusz P. Nowak <mateusz.p.nowak@intel.com>
Signed-off-by: Mateusz P. Nowak <mateusz.p.nowak@intel.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Adds support for Unitrace tracing in all benchmarks by introducing new CLI flags, integrating Unitrace setup/teardown logic, and extending benchmark run methods.
- Introduce
--unitrace
and--unitrace-inclusive
flags and related options - Add
unitrace.py
utilities (prepare, cleanup, handle, prune) - Update
run_bench
and individual benchmarks to acceptunitrace_timestamp
and skip exclusions
Reviewed Changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 4 comments.
Show a summary per file
File | Description |
---|---|
devops/scripts/benchmarks/utils/unitrace.py | New module for Unitrace log handling |
devops/scripts/benchmarks/options.py | Added CLI options for Unitrace and save_name |
devops/scripts/benchmarks/main.py | Integrated Unitrace flags into workflow and CLI |
devops/scripts/benchmarks/html/data.js | Removed trailing whitespace |
devops/scripts/benchmarks/history.py | Updated save signature to include timestamp |
devops/scripts/benchmarks/benches/velocity.py | Extended run to forward unitrace_timestamp |
devops/scripts/benchmarks/benches/umf.py | Extended run to forward unitrace_timestamp |
devops/scripts/benchmarks/benches/test.py | Extended run to forward unitrace_timestamp |
devops/scripts/benchmarks/benches/syclbench.py | Extended run to forward unitrace_timestamp |
devops/scripts/benchmarks/benches/llamacpp.py | Extended run to forward unitrace_timestamp |
devops/scripts/benchmarks/benches/gromacs.py | Extended run to forward unitrace_timestamp |
devops/scripts/benchmarks/benches/compute.py | Added Unitrace exclusion list and timestamp logic |
devops/scripts/benchmarks/benches/benchdnn_list.py | Defined unitrace_exclusion_list for oneDNN bench |
devops/scripts/benchmarks/benches/benchdnn.py | Integrated Unitrace options for oneDNN bench |
devops/scripts/benchmarks/benches/base.py | Updated run_bench signature to add Unitrace hooks |
Comments suppressed due to low confidence (2)
devops/scripts/benchmarks/utils/unitrace.py:26
- [nitpick] Consider adding unit tests for
prune_unitrace_dirs
(and other helper functions) to verify that old directories are removed correctly and edge cases are handled.
def prune_unitrace_dirs(base_dir, FILECNT=10):
devops/scripts/benchmarks/benches/compute.py:229
- [nitpick] This exclusion list is duplicated in
benchdnn_list.py
; consider centralizing it in one module to avoid inconsistencies.
unitrace_exclusion_list = [
devops/scripts/benchmarks/main.py
Outdated
from utils.validate import Validate | ||
from utils.detect_versions import DetectVersion | ||
from presets import enabled_suites, presets | ||
from utils.oneapi import get_oneapi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The import get_oneapi
is not used in this file; consider removing it to avoid unused imports.
from utils.oneapi import get_oneapi |
Copilot uses AI. Check for mistakes.
bench_dir = f"{options.unitrace_res_dir}/{options.save_name}_{unitrace_timestamp}" | ||
os.makedirs(bench_dir, exist_ok=True) | ||
|
||
unitrace_output = f"{bench_dir}/{name}_{unitrace_timestamp}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Constructing file paths with hardcoded /
can break on some platforms; consider using os.path.join(options.unitrace_res_dir, f"{options.save_name}_{unitrace_timestamp}")
for portability.
bench_dir = f"{options.unitrace_res_dir}/{options.save_name}_{unitrace_timestamp}" | |
os.makedirs(bench_dir, exist_ok=True) | |
unitrace_output = f"{bench_dir}/{name}_{unitrace_timestamp}" | |
bench_dir = os.path.join(options.unitrace_res_dir, f"{options.save_name}_{unitrace_timestamp}") | |
os.makedirs(bench_dir, exist_ok=True) | |
unitrace_output = os.path.join(bench_dir, f"{name}_{unitrace_timestamp}") |
Copilot uses AI. Check for mistakes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
devops/scripts/benchmarks/main.py
Outdated
parser.add_argument( | ||
"--unitrace", | ||
action="store_true", | ||
help="Unitrace tracing for sigle iteration of benchmarks", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo in help text: "sigle" should be "single".
help="Unitrace tracing for sigle iteration of benchmarks", | |
help="Unitrace tracing for single iteration of benchmarks", |
Copilot uses AI. Check for mistakes.
@@ -86,7 +87,14 @@ def get_adapter_full_path(): | |||
), f"could not find adapter file {adapter_path} (and in similar lib paths)" | |||
|
|||
def run_bench( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Avoid mutable default arguments like extra_unitrace_opt=[]
; use None
and set to an empty list inside the function to prevent unintended sharing.
Copilot uses AI. Check for mistakes.
"-DBUILD_WITH_XPTI=1", | ||
"-DBUILD_WITH_MPI=0", | ||
], | ||
ld_library=get_oneapi().ld_libraries() + [f"{options.sycl}/lib"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 removed
@@ -0,0 +1,182 @@ | |||
# Copyright (C) 2024-2025 Intel Corporation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2025
print(f"Moved {pid_json_files[0]} to {dst}") | ||
|
||
# Prune old unitrace directories | ||
prune_unitrace_dirs(options.unitrace_res_dir, FILECNT=5) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You use this only here. Why bother with defining a default value of an argument that you are going to override anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Debugging leftover, removed
|
||
shutil.move(os.path.join(options.benchmark_cwd, pid_json_files[0]), dst) | ||
if options.verbose: | ||
print(f"Moved {pid_json_files[0]} to {dst}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all the logs will need to be changed after #19158 is merged.
return bench_dir, unitrace_output, unitrace_command | ||
|
||
|
||
def handle_unitrace_output(bench_dir, unitrace_output, timestamp): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like this is all very complicated. Maybe we can use directory structure to help us?
Let's say we have two benchmarks: "SubmitKernel" and "Hashtable". We can organize the unitrace output such that each trace file is in its own directory:
.../results/traces/SubmitKernel/YYYYMMDD_HHMMSS_[name].PID.out
.../results/traces/Hashtable/YYYYMMDD_HHMMSS_[name].PID.out
Removing old results would become simply removing all but 5 newest files in a directory. We wouldn't have to find, rename and move files, since we could make unitrace output them to the destination place upfront.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, various approaches are possible, no problem to follow yours.
devops/scripts/benchmarks/main.py
Outdated
parser.add_argument( | ||
"--unitrace", | ||
action="store_true", | ||
help="Unitrace tracing for sigle iteration of benchmarks", | ||
) | ||
|
||
parser.add_argument( | ||
"--unitrace-inclusive", | ||
action="store_true", | ||
help="Regular run of benchmarks iterations and unitrace tracing in single additional run", | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please do --unitrace inclusive
and --unitrace
as a single option, not two separate ones. See how --output-html
is done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -224,6 +224,71 @@ def parse_unit_type(compute_unit): | |||
|
|||
|
|||
class ComputeBenchmark(Benchmark): | |||
|
|||
# list of benchmarks to exclude from unitrace due to SIGSEGV, SIGABRT or timeouts | |||
unitrace_exclusion_list = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should just let things fail. If unitrace is failing, it's an urgent bug that needs to be fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, ok.
Signed-off-by: Mateusz P. Nowak <mateusz.p.nowak@intel.com>
print(f"Renamed {src} to {dst}") | ||
break | ||
|
||
# Handle {name}.{pid}.json files in cwd: move and rename to {self.name()}_{timestamp}.json |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't unitrace
already have an option for changing the default name of jsons and .pid
logs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unitrace produces two kinds of output files - traces, configurable with --output (pid is added anyway), and jsons for visualization in Perfetto UI, created with no control over their names - just ..json in cwd
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bummer. Maybe we could change unitrace to use the output name for both outputs. The visualization json could be something like name.pid.vis.json
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One general comment - when under tracing, we should reduce the number of iterations etc, to shorten the duration of the benchmark.
If we really need to disable a benchmark (because it's difficult to make it sufficiently short for example), the Benchmark class should have something like traceable()
to filter out which benchmarks should / shouldn't be traced.
Added in Benchmark, and a stub in ComputeBenchmark |
Signed-off-by: Mateusz P. Nowak <mateusz.p.nowak@intel.com>
Signed-off-by: Mateusz P. Nowak <mateusz.p.nowak@intel.com>
Run main.py script with --unitrace to get unitrace logs for every benchmark.
Use --unitrace-inclusive to get benchmark results and unitrace logs.